feat(ai): add $ai_completion_id and $ai_provider_metadata to OpenAI events by johnsykim · Pull Request #3306 · PostHog/posthog-js

johnsykim · 2026-03-31T21:27:47Z

Summary

Adds two new auto-captured properties to $ai_generation events for the OpenAI and Azure OpenAI wrappers, enabling correlation between PostHog events and OpenAI's Logs dashboard (platform.openai.com/logs/{completion_id}):

$ai_completion_id — provider-assigned response ID (chatcmpl-…, resp_…). Lives at the top of the $ai_* namespace because response IDs generalize across providers.
$ai_provider_metadata — OpenAI-specific fields (system_fingerprint, request_id) collected under a single blob rather than polluting the shared, provider-agnostic schema. Sets the pattern for Anthropic / Gemini wrappers to follow.

Both options are now accepted by the public captureAiGeneration primitive, so external instrumentation produces identical events.

Revision notes (rebased against the new `captureAiGeneration` primitive)

This branch was originally written against sendEventToPosthog. Since #3499 landed, every wrapper funnels through captureAiGeneration, so the implementation was rewritten cleanly on top of the new primitive.

Addresses Richard's review:

✅ Move OpenAI-specific fields under $ai_provider_metadata. system_fingerprint and request_id are OpenAI-only concepts and now live under $ai_provider_metadata instead of at the top level.
✅ Keep $ai_completion_id at the top level. Response IDs generalize across providers, so this stays in the shared schema.
✅ Fix the TS2353 build error. mockOpenAiChatResponse is now typed as ChatCompletion & { _request_id?: string }. pnpm build passes.
✅ Consolidate the duplicated (result as any)._request_id cast. Replaced all six call sites with a single extractRequestId(result) helper in packages/ai/src/openai/utils.ts, with an explanatory doc comment in one place.
⏭️ Streaming request_id is deferred to a follow-up as suggested — it requires threading .withResponse() through the streaming flow, which is a larger refactor than this PR. Streaming chats still capture $ai_completion_id and system_fingerprint from accumulated chunks.

Also addresses Greptile's outstanding comment about Responses API request_id coverage: the responses.parse and responses.create web-search tests both now exercise the positive case.

What it captures, by path

| Path | `$ai_completion_id` | `$ai_provider_metadata` |
| --- | --- | --- |
| Chat Completions, non-streaming | `result.id` | `system_fingerprint`, `request_id` |
| Chat Completions, streaming | accumulated `chunk.id` | `system_fingerprint` (from chunks) |
| Responses API, non-streaming (`create`) | `result.id` | `request_id` |
| Responses API, streaming | accumulated `chunk.response.id` | none |
| Responses API, `parse` | `result.id` | `request_id` |

Same matrix for the Azure OpenAI wrapper.
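
The two streaming rows of the matrix can be sketched roughly as follows. This is only an illustration under stated assumptions: `ChatChunk`, `ResponsesChunk`, `StreamIds`, and both function names are hypothetical stand-ins, not the SDK's chunk types or the wrapper's actual code.

```typescript
// Stand-in chunk shapes for illustration; the real types come from the openai SDK.
interface ChatChunk {
  id?: string
  system_fingerprint?: string | null
}

interface ResponsesChunk {
  response?: { id?: string }
}

interface StreamIds {
  completionId?: string
  systemFingerprint?: string
}

// Chat Completions streaming: take the id and system_fingerprint from
// whichever chunks carry them, keeping the most recent non-empty value.
function accumulateChatIds(chunks: Iterable<ChatChunk>): StreamIds {
  const ids: StreamIds = {}
  for (const chunk of chunks) {
    if (chunk.id) ids.completionId = chunk.id
    if (chunk.system_fingerprint) ids.systemFingerprint = chunk.system_fingerprint
  }
  return ids
}

// Responses API streaming: the id is nested under chunk.response, and no
// provider metadata is collected on this path.
function accumulateResponsesIds(chunks: Iterable<ResponsesChunk>): StreamIds {
  const ids: StreamIds = {}
  for (const chunk of chunks) {
    if (chunk.response?.id) ids.completionId = chunk.response.id
  }
  return ids
}
```

Keeping the accumulated values in one small object makes it straightforward to hand the same values to either the success or the error capture call.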

Test plan

pnpm lint clean
pnpm build clean (TS2353 fixed)
New unit tests for the extractRequestId / buildProviderMetadata helpers (packages/ai/tests/openai-utils.test.ts) — 11 cases including the "empty → undefined" branch
OpenAI basic completion asserts $ai_completion_id + $ai_provider_metadata with both fields
OpenAI basic streaming completion asserts $ai_completion_id + $ai_provider_metadata with system_fingerprint only
OpenAI responses parse asserts $ai_completion_id + $ai_provider_metadata.request_id
OpenAI responses.create (web-search test) asserts $ai_completion_id + $ai_provider_metadata.request_id
Azure OpenAI basic completion mirrors the OpenAI assertions

Original background & motivation

Why this matters

The @posthog/ai package wraps the OpenAI SDK and auto-emits $ai_generation events. Without response.id, there's no way to navigate from a PostHog $ai_generation event to the corresponding entry in OpenAI's Logs dashboard (platform.openai.com/logs/{completion_id}).

Debugging. When investigating a cost spike or error, engineers need to jump from PostHog analytics → OpenAI's detailed log (full prompt/response, token breakdown, latency).
Correlation. Without response.id, joining PostHog events with OpenAI's own logging system requires multi-hop lookups via entity IDs.
Support. OpenAI support tickets reference x-request-id — having this on the PostHog event makes it trivial to file tickets with the right correlation ID.
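
To make the debugging jump concrete, here is a small sketch of turning an event's `$ai_completion_id` into a Logs dashboard link, using the URL pattern quoted above. The `openAiLogsUrl` helper and the flat `properties` record are assumptions for illustration; the PR itself adds no such helper.

```typescript
// Hypothetical helper; nothing like this is added by the PR itself.
// The URL pattern is the one quoted in the PR description.
function openAiLogsUrl(properties: Record<string, unknown>): string | undefined {
  const completionId = properties['$ai_completion_id']
  // Events captured before this change, or from other providers, have no usable ID.
  if (typeof completionId !== 'string' || completionId.length === 0) {
    return undefined
  }
  return `https://platform.openai.com/logs/${encodeURIComponent(completionId)}`
}
```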

References

OpenAI Logs dashboard: https://platform.openai.com/logs?api=chat-completions
Completion ID formats: chatcmpl-{base62} (Chat Completions), resp_{base62} (Responses)
OpenAI SDK _request_id: attached from the x-request-id response header by the openai npm package

vercel · 2026-03-31T21:27:53Z

@johnsykim is attempting to deploy a commit to the PostHog Team on Vercel.

A member of the Team first needs to authorize it.

greptile-apps · 2026-03-31T21:33:59Z

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
packages/ai/tests/openai-utils.test.ts:29-38
These two test cases each contain multiple inline assertions that should be parameterized. The project explicitly prefers `it.each` for multi-case checks, and the `extractRequestId` tests in the same file already follow this pattern correctly.

```suggestion
  it.each([
    [{ systemFingerprint: 'fp_1' }, { system_fingerprint: 'fp_1' }],
    [{ requestId: 'req_1' }, { request_id: 'req_1' }],
  ])('omits keys whose value is missing: %p → %p', (input, expected) => {
    expect(buildProviderMetadata(input)).toEqual(expected)
  })

  it.each([
    [{}],
    [{ systemFingerprint: undefined, requestId: undefined }],
    [{ systemFingerprint: null, requestId: null }],
  ])('returns undefined when there is nothing to report: %p', (input) => {
    expect(buildProviderMetadata(input)).toBeUndefined()
  })
```

Reviews (4): Last reviewed commit: "feat(ai): add $ai_completion_id and $ai_..."

johnsykim · 2026-03-31T21:43:58Z

@greptile review

github-actions · 2026-04-08T09:51:02Z

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

johnsykim · 2026-04-09T18:36:40Z

@haacked @danielbachhuber Sorry for the random tagging (please tell me if there is a better PostHog POC to reach out to). What's the best way to get the PostHog team's attention on this PR? Is there a certain review and release process I have to follow?

danielbachhuber · 2026-04-09T20:39:35Z

@johnsykim Hi! I'm no longer with PostHog but, based on the history of the files, @richardsolomou, @carlos-marchal-ph, or @Radu-Raicea might be able to give you a review.

richardsolomou

Thanks for putting this together — the motivation (correlating PostHog events to OpenAI's logs dashboard) is a real pain point and the implementation is clean. Chatted with the team about the schema shape, and we'd like to adjust the approach before merging.

Design ask

Move OpenAI-specific fields under a $ai_provider_metadata blob — $ai_* has been provider-agnostic so far, and system_fingerprint / request_id are OpenAI-only concepts. Rather than living at the top level, they should go under a new $ai_provider_metadata property so each provider wrapper can surface its own metadata without polluting the shared namespace. Something like:
```
$ai_provider_metadata: {
  system_fingerprint: '...',
  request_id: '...',
}
```
This also sets the pattern for Anthropic / Gemini wrappers to follow later.
Keep $ai_completion_id at the top level — response IDs generalize cleanly (Anthropic and Gemini both have equivalents), so this one belongs in the shared schema.

On the app side: we think rendering a link to the OpenAI log from the event inspector is valuable and we'll pick that up separately once this is merged!

Blocking

Build fails with TS2353 on packages/ai/tests/openai.test.ts:292 — _request_id isn't a public property of ChatCompletion, so the literal is rejected under strict TS. pnpm build fails on this branch and passes on main. CI for external PRs only runs Wiz/Graphite/Vercel, which is why this slipped past. Easiest fix:
```
let mockOpenAiChatResponse: ChatCompletion & { _request_id?: string } = {} as ChatCompletion
```
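
As a rough illustration of why the literal is rejected and why the intersection type fixes it, the sketch below uses a trimmed stand-in interface (`ChatCompletionLike`, an assumption so the example compiles without the openai package) in place of the SDK's `ChatCompletion`.

```typescript
// Stand-in for the SDK's ChatCompletion type, trimmed to two fields so the
// example compiles without the openai package installed.
interface ChatCompletionLike {
  id: string
  model: string
}

// Rejected: an object literal assigned to ChatCompletionLike may only name
// known properties, so adding `_request_id` triggers TS2353.
// const broken: ChatCompletionLike = { id: 'chatcmpl-1', model: 'test-model', _request_id: 'req_1' }

// Accepted: the intersection widens the declared type with the optional field.
const mockResponse: ChatCompletionLike & { _request_id?: string } = {
  id: 'chatcmpl-1',
  model: 'test-model',
  _request_id: 'req_1',
}
```

Widening only the mock's declared type keeps the semi-private field out of production type signatures.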

Suggestions

Duplicated (result as any)._request_id pattern — appears 6 times across packages/ai/src/openai/index.ts and packages/ai/src/openai/azure.ts. A small helper (extractRequestId(result)) would consolidate the cast and the explanatory comment in one place.
Streaming paths don't capture requestId — only completionId / systemFingerprint are pulled from chunks. The OpenAI SDK exposes _request_id on the aggregated stream response (via .withResponse()), so there's room to capture it for streams too. Fine as a follow-up.

Let me know if anything's unclear — happy to answer questions as you work through the restructure.

johnsykim · 2026-04-29T19:06:34Z

Thanks a lot @richardsolomou. I'll raise a revision sometime soon.

carlos-marchal-ph · 2026-05-08T11:32:13Z

Hi @johnsykim, taking this over from @richardsolomou! I agree with Richard's review. I'm going to mark this as a draft in the meantime; un-draft it yourself when it's ready for us again, and we'll get pinged to take a second look :)

…vents

Adds two new auto-captured properties to `$ai_generation` events emitted by the OpenAI and Azure OpenAI wrappers, enabling direct correlation between PostHog events and OpenAI's Logs dashboard (`platform.openai.com/logs/{completion_id}`):

- `$ai_completion_id`: provider-assigned response ID (e.g. `chatcmpl-…`, `resp_…`). Generalises across providers, so it lives at the top of the `$ai_*` namespace.
- `$ai_provider_metadata`: OpenAI-specific fields (`system_fingerprint`, `request_id`) collected under a single blob rather than polluting the shared schema. Sets the pattern for Anthropic / Gemini wrappers to follow.

Both options are accepted by the public `captureAiGeneration` primitive, so external instrumentation produces identical events.

Coverage spans non-streaming and streaming Chat Completions plus Responses API (`create` + `parse`) for both OpenAI and Azure OpenAI. Streaming paths extract `id` / `system_fingerprint` from accumulated chunks; the `x-request-id` header is read via the OpenAI SDK's semi-private `_request_id` field through a small typed helper. Streaming `request_id` capture (which needs `.withResponse()`) is left as a follow-up; see review thread on PostHog#3306.

Addresses review feedback on PostHog#3306:

- restructure `system_fingerprint` / `request_id` under `$ai_provider_metadata`
- fix the TS2353 build error on the test mock by widening `ChatCompletion` with `{ _request_id?: string }`
- consolidate the duplicated `(result as any)._request_id` cast into `extractRequestId` with an explanatory comment in one place
- add Responses API `$ai_completion_id` / `request_id` test coverage

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

johnsykim · 2026-05-19T23:02:31Z

@carlos-marchal-ph @richardsolomou Revision is up, ready for review again. Brief summary of what changed:

Rebased onto the new captureAiGeneration primitive (sendEventToPosthog no longer exists since #3499, "feat(posthog-ai): add captureAiGeneration as canonical capture primitive"), so the implementation is rewritten cleanly on top of that. completionId and providerMetadata are now options on CaptureAiGenerationOptions, available to anyone calling the primitive directly.
system_fingerprint and request_id moved under $ai_provider_metadata as Richard suggested. $ai_completion_id stays at the top level since response IDs generalize across providers.
TS2353 build error fixed. pnpm build and pnpm lint both pass.
(result as any)._request_id consolidated into extractRequestId with a doc comment in one place (packages/ai/src/openai/utils.ts).
Streaming request_id is left as a follow-up as you suggested — needs .withResponse() threading.
Test coverage extended: dedicated unit tests for the new helpers, $ai_completion_id + $ai_provider_metadata asserted on non-streaming chat, streaming chat, responses.parse, responses.create (non-streaming), and the Azure equivalent. This also closes Greptile's outstanding comment about missing Responses API request_id coverage.

Full revision notes in the updated PR description above. Thanks for hanging in there while I got around to this!

johnsykim · 2026-05-19T23:04:33Z

@greptile review

greptile-apps · 2026-05-19T23:07:55Z

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/ai/tests/openai-utils.test.ts:29-38
**Non-parameterised test blocks bundle multiple independent cases**

The `buildProviderMetadata` tests pack multiple distinct inputs into the same `it` block. Per the repo's simplicity rule ("we always prefer parameterised tests"), each input/output pair should be its own case in an `it.each` table. As written, a failure only points to the block, not the specific input that broke.

For example, "omits keys whose value is missing" tests `{ systemFingerprint }` and `{ requestId }` separately, and "returns undefined when there is nothing to report" tests `{}`, `{ …undefined }`, and `{ …null }` — these are three distinct edge-cases that would be clearer as three rows in an `it.each`.

### Issue 2 of 2
packages/ai/src/openai/index.ts:287-305
**Error capture path in streaming skips `completionId` even when it is available**

`completionIdFromResponse` and `systemFingerprintFromResponse` are declared in the outer scope and may have been populated by one or more chunks before the exception fires. However, the `catch` block that emits the error event never passes these values. When the stream fails mid-flight (after the first chunk carrying `chunk.id`), the error event has no `$ai_completion_id`, making it impossible to correlate the error against OpenAI's Logs dashboard — the exact use-case this PR enables.

The same gap exists in the equivalent streaming catch block in `azure.ts`.

Reviews (5): Last reviewed commit: "feat(ai): add $ai_completion_id and $ai_..."

- Streaming error path now surfaces accumulated completion metadata. When a stream fails mid-flight after consuming chunks that carried `chunk.id` / `chunk.system_fingerprint`, the error event was being emitted without `$ai_completion_id` / `$ai_provider_metadata`, the exact correlation IDs this PR is meant to enable. Hoisted the accumulators above the `try` block and pass them in the `catch` capture for both `index.ts` and `azure.ts`, chat and Responses.
- Convert the `openai-utils` tests to parameterised `it.each` tables per the repo's "always prefer parameterised tests" convention; each input/output pair is now its own row so a failure points to the specific case that broke.

johnsykim · 2026-05-19T23:14:07Z

Addressed both Greptile points on 1547db60:

Streaming error path now surfaces accumulated completion metadata. Hoisted completionIdFromResponse / systemFingerprintFromResponse above the try and pass them in the catch capture (index.ts + azure.ts, chat and Responses). Mid-flight stream failures now carry whatever correlation IDs the consumed chunks already provided — the exact case Greptile flagged.
Parameterised the openai-utils tests. Converted both describe blocks to it.each tables so each input/output pair is its own row.
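
The hoisting fix described in the first point can be sketched as below. This is a simplified illustration, not the wrapper's code: `Chunk`, `CapturedEvent`, `consumeStream`, and the `capture` callback are hypothetical stand-ins (the real path goes through `captureAiGeneration`, whose signature is not shown in this thread), and `failingStream` only simulates a mid-flight failure.

```typescript
// Stand-in types; `capture` is a hypothetical callback, not the real
// captureAiGeneration signature.
interface Chunk {
  id?: string
  system_fingerprint?: string | null
}

interface CapturedEvent {
  completionId?: string
  systemFingerprint?: string
  isError: boolean
}

async function consumeStream(
  stream: AsyncIterable<Chunk>,
  capture: (event: CapturedEvent) => void
): Promise<void> {
  // Declared above the try so the catch block can still read whatever
  // the already-consumed chunks provided.
  let completionIdFromResponse: string | undefined
  let systemFingerprintFromResponse: string | undefined
  try {
    for await (const chunk of stream) {
      if (chunk.id) completionIdFromResponse = chunk.id
      if (chunk.system_fingerprint) systemFingerprintFromResponse = chunk.system_fingerprint
    }
    capture({
      completionId: completionIdFromResponse,
      systemFingerprint: systemFingerprintFromResponse,
      isError: false,
    })
  } catch (error) {
    // The error event carries the IDs accumulated before the failure.
    capture({
      completionId: completionIdFromResponse,
      systemFingerprint: systemFingerprintFromResponse,
      isError: true,
    })
    throw error
  }
}

// Simulated stream that fails after its first chunk.
async function* failingStream(): AsyncGenerator<Chunk> {
  yield { id: 'chatcmpl-1', system_fingerprint: 'fp_1' }
  throw new Error('connection reset')
}
```

With `failingStream()` as input, the single captured event is an error event that still carries the completion ID and fingerprint from the first chunk.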

@greptile review

greptile-apps · 2026-05-19T23:16:31Z

Reviews (6): Last reviewed commit: "fix(ai): address Greptile feedback on th..."

greptile-apps · 2026-05-19T23:21:41Z

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
packages/ai/tests/openai-utils.test.ts:11-14
`extractRequestId` returns the raw value of `_request_id` via `?? undefined`, which converts `null` but **not** an empty string. So `extractRequestId({ _request_id: '' })` returns `''`. `buildProviderMetadata` then silently drops it (truthy check), so behavior is still correct — but the interaction isn't exercised anywhere in the test suite. Adding the case documents the contract explicitly and guards against a future change to the truthy check silently breaking it.

```suggestion
    ['returns undefined for numeric input', 42, undefined],
    ['returns empty string when `_request_id` is empty string', { _request_id: '' }, ''],
  ])('%s', (_name, input, expected) => {
    expect(extractRequestId(input)).toBe(expected)
  })
```

Reviews (7): Last reviewed commit: "fix(ai): address Greptile feedback on th..."
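
The contract Greptile describes can be sketched as follows. This is a reconstruction from the behaviour stated in this thread only (the cast lives in one helper, `?? undefined` converts `null`, a truthy check drops empty values, an empty result becomes `undefined`); the code merged in `packages/ai/src/openai/utils.ts` may differ, and the `ProviderMetadata` type and parameter shapes are assumptions.

```typescript
// Hypothetical reconstruction; not the code merged in packages/ai/src/openai/utils.ts.

// Assumed shape of the blob stored under $ai_provider_metadata.
interface ProviderMetadata {
  system_fingerprint?: string
  request_id?: string
}

// The openai npm package attaches the x-request-id response header to results
// as `_request_id`. It is not part of the public response types, so the cast
// is confined to this one helper.
function extractRequestId(result: unknown): string | undefined {
  if (typeof result !== 'object' || result === null) {
    return undefined
  }
  const requestId = (result as { _request_id?: string | null })._request_id
  // `??` converts null and undefined, but passes an empty string through.
  return requestId ?? undefined
}

function buildProviderMetadata(input: {
  systemFingerprint?: string | null
  requestId?: string | null
}): ProviderMetadata | undefined {
  const metadata: ProviderMetadata = {}
  // Truthy checks drop undefined, null, and empty strings alike.
  if (input.systemFingerprint) metadata.system_fingerprint = input.systemFingerprint
  if (input.requestId) metadata.request_id = input.requestId
  return Object.keys(metadata).length > 0 ? metadata : undefined
}
```

Under this sketch, `extractRequestId({ _request_id: '' })` returns `''`, and passing that value as `requestId` to `buildProviderMetadata` yields `undefined`, which is the interaction the suggested test row documents.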

johnsykim · 2026-05-27T23:51:51Z

@carlos-marchal-ph Let me know if the PR is good to go!

carlos-marchal-ph

Looks good, just pushed a tiny style fix and will go ahead and merge it!

greptile-apps Bot reviewed Mar 31, 2026

johnsykim marked this pull request as draft March 31, 2026 21:36

johnsykim marked this pull request as ready for review March 31, 2026 21:58

github-actions Bot added the stale label Apr 8, 2026

github-actions Bot removed the stale label Apr 10, 2026

github-actions Bot added the stale label Apr 17, 2026

richardsolomou removed the stale label Apr 17, 2026

richardsolomou requested changes Apr 21, 2026

github-actions Bot added the stale label Apr 29, 2026

richardsolomou removed the stale label Apr 29, 2026

github-actions Bot added the stale label May 7, 2026

richardsolomou requested a review from a team May 8, 2026 07:55

github-actions Bot removed the stale label May 8, 2026

carlos-marchal-ph marked this pull request as draft May 8, 2026 11:32

github-actions Bot added the stale label May 18, 2026

johnsykim force-pushed the feat/ai-completion-id branch from ce4c721 to bf5b13a Compare May 19, 2026 23:01

johnsykim changed the title ~~feat(ai): add $ai_completion_id, $ai_system_fingerprint, $ai_request_id to OpenAI events~~ feat(ai): add $ai_completion_id and $ai_provider_metadata to OpenAI events May 19, 2026

johnsykim marked this pull request as ready for review May 19, 2026 23:02

johnsykim marked this pull request as draft May 19, 2026 23:08

johnsykim marked this pull request as ready for review May 19, 2026 23:18

github-actions Bot removed the stale label May 20, 2026

github-actions Bot added the stale label May 27, 2026

github-actions Bot removed the stale label May 28, 2026

fix(aiobs): align system_fingerprint guard with surrounding style

929a949

Generated-By: PostHog Code Task-Id: 672e907b-e741-4f0e-abf2-76722b296982

carlos-marchal-ph approved these changes May 28, 2026

carlos-marchal-ph requested review from richardsolomou and removed request for richardsolomou May 28, 2026 16:13

richardsolomou approved these changes May 28, 2026

Merge branch 'main' into feat/ai-completion-id

fdcf2ed

carlos-marchal-ph merged commit 1a2f8a8 into PostHog:main May 29, 2026
44 of 48 checks passed
